perceptual representation
PALMER: Perception - Action Loop with Memory for Long-Horizon Planning
To achieve autonomy in a priori unknown real-world scenarios, agents should be able to: i) act from high-dimensional sensory observations (e.g., images), ii) learn from past experience to adapt and improve, and iii) be capable of long-horizon planning. Classical sampling-based planning algorithms (e.g., PRM, RRT) are proficient at handling long-horizon planning. Deep learning-based methods in turn can provide the necessary representations to address the others, by modeling statistical contingencies between observations. In this direction, we introduce a general-purpose planning algorithm called PALMER that combines classical sampling-based planning algorithms with learning-based perceptual representations. For training these perceptual representations, we combine Q-learning with contrastive representation learning to create a latent space where the distance between the embeddings of two states captures how easily an optimal policy can traverse between them.
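The abstract's core idea, a sampling-based roadmap built over learned embeddings, can be sketched roughly as follows. Everything here is a placeholder assumption: `encode` stands in for PALMER's trained perceptual encoder, and `latent_distance` is plain Euclidean distance, whereas in the paper this distance is shaped by Q-learning and contrastive learning to reflect traversal cost under an optimal policy.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))  # stand-in for learned encoder weights (assumption)

def encode(obs):
    # Placeholder for the trained perceptual encoder mapping observations
    # to the latent space.
    return W @ obs

def latent_distance(z1, z2):
    # In PALMER this distance is *trained* to capture how easily an optimal
    # policy can traverse between the two states; Euclidean is a stand-in.
    return float(np.linalg.norm(z1 - z2))

# PRM-style roadmap over a replay memory of embedded past states: connect
# pairs whose latent distance falls under a reachability threshold, so that
# long-horizon plans become shortest paths on this graph.
memory = [rng.normal(size=8) for _ in range(20)]
zs = [encode(o) for o in memory]
threshold = np.median([latent_distance(a, b) for a in zs for b in zs])
edges = [(i, j) for i in range(len(zs)) for j in range(i + 1, len(zs))
         if latent_distance(zs[i], zs[j]) < threshold]
```

A real implementation would replace the threshold test with the learned reachability estimate and run a graph-search planner over `edges`.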
LangNav: Language as a Perceptual Representation for Navigation
Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim
We explore the use of language as a perceptual representation for vision-and-language navigation. Our approach uses off-the-shelf vision systems (for image captioning and object detection) to convert an agent's egocentric panoramic view at each time step into natural language descriptions. We then finetune a pretrained language model to select an action, based on the current view and the trajectory history, that would best fulfill the navigation instructions. In contrast to the standard setup which adapts a pretrained language model to work directly with continuous visual features from pretrained vision models, our approach instead uses (discrete) language as the perceptual representation. We explore two use cases of our language-based navigation (LangNav) approach on the R2R vision-and-language navigation benchmark: generating synthetic trajectories from a prompted large language model (GPT-4) with which to finetune a smaller language model; and sim-to-real transfer where we transfer a policy learned on a simulated environment (ALFRED) to a real-world environment (R2R). Our approach is found to improve upon strong baselines that rely on visual features in settings where only a few gold trajectories (10-100) are available, demonstrating the potential of using language as a perceptual representation for navigation tasks.
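The pipeline described above (caption each view, then let a language model pick the action) can be sketched in a few lines. All names here are hypothetical stubs, not the paper's code: `caption_view` stands in for the off-the-shelf captioner/detector, and `lm_score` stands in for the finetuned language model's score of an action given the instruction, trajectory history, and current view.

```python
def caption_view(view_id):
    # Stand-in for an off-the-shelf image captioner / object detector
    # describing one heading of the egocentric panoramic view.
    captions = {
        0: "a hallway leading to an open door",
        1: "a kitchen counter with a sink",
        2: "a staircase going down",
    }
    return captions[view_id]

def lm_score(prompt, instruction):
    # Stand-in for a finetuned LM's log-probability of the candidate action;
    # here a crude word-overlap score (assumption, for illustration only).
    return -len(set(prompt.split()) ^ set(instruction.split()))

def select_action(instruction, history, candidate_views):
    # Build a purely textual state: instruction + trajectory history +
    # language descriptions of each candidate direction.
    prompt = f"Instruction: {instruction}\nHistory: {'; '.join(history)}\n"
    scored = [(lm_score(prompt + f"go toward {caption_view(v)}", instruction), v)
              for v in candidate_views]
    return max(scored)[1]
```

The key design point the abstract makes is that the agent's state is entirely discrete text, which is what lets a prompted LLM generate synthetic trajectories and enables sim-to-real transfer at the language level.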
Biologically Plausible Learning Rules for Perceptual Systems that Maximize Mutual Information
Consider a neural perceptual system being exposed to an external environment. The system has a certain internal state to represent external events. There is strong behavioral and neural evidence (e.g., Ernst and Banks, 2002; Gabbiani and Koch, 1998) that the internal representation is intrinsically probabilistic (Knill and Pouget, 2004), in line with the statistical properties of the environment. We denote the input signal as x. The perceptual representation is then a probability distribution conditional on x, denoted p(y|x). According to the Infomax principle (Attneave, 1954; Barlow et al., 1961; Linsker, 1988), the system's goal is to maximize the mutual information (MI) between the input x and the output (neuronal response) y, which can be written as max I(x; y). (1.1)
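For reference, the objective in (1.1) expands under the standard definition of mutual information (a textbook identity, not specific to this paper):

```latex
I(x;y) \;=\; H(y) - H(y \mid x)
       \;=\; \iint p(x)\, p(y \mid x)\,
             \log \frac{p(y \mid x)}{p(y)}\; \mathrm{d}x\, \mathrm{d}y ,
```

so maximizing I(x;y) over the conditional p(y|x) trades off high marginal response entropy H(y) against low conditional noise entropy H(y|x).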
An Information-Theoretic Framework for Understanding Saccadic Eye Movements
In this paper, we propose that information maximization can provide a unified framework for understanding saccadic eye movements. In this framework, the mutual information among the cortical representations of the retinal image, the priors constructed from our long-term visual experience, and a dynamic short-term internal representation constructed from recent saccades provides a map for guiding eye navigation. By directing the eyes to locations of maximum complexity in neuronal ensemble responses at each step, the automatic saccadic eye movement system greedily collects information about the external world, while modifying the neural representations in the process. This framework attempts to connect several psychological phenomena, such as pop-out and inhibition of return, to long-term visual experience and short-term working memory. It also provides an interesting perspective on contextual computation and formation of neural representation in the visual system.
1 Introduction
When we look at a painting or a visual scene, our eyes move around rapidly and constantly to look at different parts of the scene.
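The greedy fixation policy with inhibition of return described above admits a very small sketch. This is our illustrative assumption, not the paper's model: `salience` stands in for the "complexity in neuronal ensemble responses" map, and inhibition of return is modeled by simply zeroing out visited locations.

```python
import numpy as np

rng = np.random.default_rng(1)
salience = rng.random((5, 5))  # stand-in for ensemble-response complexity

def next_fixations(salience_map, n_saccades, inhibition=0.0):
    # Greedy infomax-style scanpath: repeatedly fixate the location of
    # maximum complexity, then suppress it (inhibition of return).
    s = salience_map.copy()
    path = []
    for _ in range(n_saccades):
        idx = np.unravel_index(np.argmax(s), s.shape)
        path.append(idx)
        s[idx] = inhibition  # don't revisit this location
    return path
```

In the paper's framework the map itself would also be updated after each saccade (the representations are modified in the process); here it is held fixed apart from the suppression step.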